Efficient Stochastic Gradient Descent for Strongly Convex Optimization
Authors
Abstract
We motivate this study from a recent work on a stochastic gradient descent (SGD) method with only one projection (Mahdavi et al., 2012), which aims at alleviating the computational bottleneck of the standard SGD method, namely performing a projection at every iteration, and enjoys an O(log T / T) convergence rate for strongly convex optimization. In this paper, we make further contributions along this line. First, we develop an epoch-projection SGD method that makes only a small number of projections (at most log_2 T) yet achieves the optimal O(1/T) convergence rate for strongly convex optimization. Second, we present a proximal extension that exploits the structure of the objective function to further speed up computation and convergence for sparse regularized loss minimization problems. Finally, we apply the proposed techniques to the high-dimensional large-margin nearest-neighbor classification problem, yielding a speed-up of orders of magnitude.
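To make the epoch-projection idea concrete, below is a minimal Python sketch of an SGD loop that projects onto the feasible set only once per epoch, with a doubling epoch schedule so the total number of projections stays O(log T). The names `stoch_grad` and `project`, the initial epoch length, and the step-size schedule are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def epoch_projection_sgd(stoch_grad, project, w0, T, eta1=0.1):
    """Sketch of epoch-based SGD with a single projection per epoch.

    stoch_grad(w): returns a stochastic (sub)gradient at w.
    project(w):    Euclidean projection onto the feasible set.
    Epoch lengths double and step sizes halve between epochs, so the
    total number of projections is O(log T); the exact schedule in the
    paper may differ (these constants are assumptions for illustration).
    """
    w = np.asarray(w0, dtype=float)
    t = 0
    epoch_len = max(1, T // 8)            # initial epoch length (assumption)
    eta = eta1
    while t < T:
        steps = min(epoch_len, T - t)
        w_sum = np.zeros_like(w)
        for _ in range(steps):
            w = w - eta * stoch_grad(w)   # plain SGD step, no projection
            w_sum += w
            t += 1
        w = project(w_sum / steps)        # one projection at the end of the epoch
        epoch_len *= 2                    # doubling epochs
        eta *= 0.5                        # halve the step size
    return w
```

For instance, with `project = lambda w: w / max(1.0, np.linalg.norm(w))` (projection onto the unit Euclidean ball), the loop touches the projection operator only a handful of times even for large T, which is the computational saving the abstract refers to.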
Similar papers
Accelerated Stochastic Gradient Descent for Minimizing Finite Sums
We propose an optimization method for minimizing finite sums of smooth convex functions. Our method combines accelerated gradient descent (AGD) with the stochastic variance reduced gradient (SVRG) method in a mini-batch setting. Unlike SVRG, our method can be applied directly to both non-strongly convex and strongly convex problems. We show that our method achieves a lower overall complexity than the re...
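As a rough illustration of combining variance reduction with acceleration in a mini-batch setting, the following Python sketch applies SVRG-style variance-reduced gradients together with a heavy-ball/Nesterov-type extrapolation step. The callables `grads` and `full_grad` and all constants are assumptions for illustration; this is not the authors' exact algorithm.

```python
import numpy as np

def accelerated_svrg(grads, full_grad, w0, n, epochs=20, m=None,
                     eta=0.1, beta=0.9, batch=10, rng=None):
    """Sketch: SVRG variance reduction plus a momentum extrapolation step.

    grads(w, idx): mean gradient of the component functions indexed by idx at w.
    full_grad(w):  gradient of the full finite sum at w.
    """
    rng = rng or np.random.default_rng(0)
    m = m or 2 * n                            # inner-loop length (assumption)
    w = y = np.asarray(w0, dtype=float)
    for _ in range(epochs):
        w_snap = w.copy()
        g_snap = full_grad(w_snap)            # full gradient at the snapshot
        for _ in range(m):
            idx = rng.integers(0, n, size=batch)
            # variance-reduced stochastic gradient at the extrapolated point y
            v = grads(y, idx) - grads(w_snap, idx) + g_snap
            w_next = y - eta * v
            y = w_next + beta * (w_next - w)  # momentum extrapolation
            w = w_next
    return w
```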
Accelerating Asynchronous Algorithms for Convex Optimization by Momentum Compensation
Asynchronous algorithms have attracted much attention recently due to the pressing demand for solving large-scale optimization problems. However, accelerated versions of asynchronous algorithms are rarely studied. In this paper, we propose the “momentum compensation” technique to accelerate asynchronous algorithms for convex problems. Specifically, we first accelerate the plain Asynchronous ...
Variants of RMSProp and Adagrad with Logarithmic Regret Bounds
Adaptive gradient methods have recently become very popular, in particular because they have proven useful for training deep neural networks. In this paper we analyze RMSProp, originally proposed for the training of deep neural networks, in the context of online convex optimization and show √T-type regret bounds. Moreover, we propose two variants, SC-Adagrad and SC-RMSProp, for...
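To illustrate the kind of update being analyzed, here is a short Python sketch of diagonal Adagrad in an online setting. The `sc_variant` flag drops the square root in the denominator, which is roughly how strongly convex ("SC-") variants trade the √T-type behavior for logarithmic regret; the flag, the damping constant `xi`, and the step size are assumptions for illustration rather than the exact updates from the paper.

```python
import numpy as np

def adagrad_online(grad_seq, w0, eta=0.1, eps=1e-8, sc_variant=False, xi=1.0):
    """Sketch of diagonal Adagrad for online convex optimization.

    grad_seq: iterable yielding the gradient g_t observed at round t.
    sc_variant=True uses v_t + xi (no square root) in the denominator,
    an illustrative stand-in for the strongly convex variants.
    """
    w = np.asarray(w0, dtype=float)
    v = np.zeros_like(w)                      # per-coordinate sum of squared gradients
    for g in grad_seq:
        g = np.asarray(g, dtype=float)
        v += g * g
        denom = (v + xi) if sc_variant else (np.sqrt(v) + eps)
        w = w - eta * g / denom               # per-coordinate adaptive step
    return w
```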
Fast Rates for Online Gradient Descent Without Strong Convexity via Hoffman's Bound
Hoffman’s classical result gives a bound on the distance of a point from a convex and compact polytope in terms of the magnitude of the constraint violation. Recently, several results showed that Hoffman’s bound can be used to derive strongly-convex-like rates for first-order methods for convex optimization of curved, though not strongly convex, functions over polyhedral sets. In this work,...
A Linearly-Convergent Stochastic L-BFGS Algorithm
We propose a new stochastic L-BFGS algorithm and prove a linear convergence rate for strongly convex and smooth functions. Our algorithm draws heavily from a recent stochastic variant of L-BFGS proposed in Byrd et al. (2014) as well as a recent approach to variance reduction for stochastic gradient descent from Johnson and Zhang (2013). We demonstrate experimentally that our algorithm performs ...
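A compact Python sketch of the general recipe (variance-reduced stochastic gradients passed through an L-BFGS two-loop recursion) is given below. The curvature pairs here come from full-gradient differences between snapshots, a simplification of the subsampled Hessian-vector products used in the cited work; all names and constants are assumptions for illustration.

```python
import numpy as np

def two_loop(g, s_list, y_list):
    """Standard L-BFGS two-loop recursion: apply the inverse-Hessian
    approximation built from (s, y) curvature pairs to the vector g."""
    q = g.copy()
    alphas = []
    for s, y in zip(reversed(s_list), reversed(y_list)):
        a = (s @ q) / (y @ s)
        alphas.append(a)
        q -= a * y
    if s_list:
        s, y = s_list[-1], y_list[-1]
        q *= (s @ y) / (y @ y)                # initial Hessian scaling
    for (s, y), a in zip(zip(s_list, y_list), reversed(alphas)):
        b = (y @ q) / (y @ s)
        q += (a - b) * s
    return q

def stochastic_lbfgs(grads, full_grad, w0, n, epochs=10, m=50,
                     eta=0.05, memory=10, batch=10, rng=None):
    """Sketch: SVRG-style variance-reduced gradients fed through L-BFGS.

    grads(w, idx): mean gradient of the components indexed by idx at w.
    full_grad(w):  gradient of the full finite sum at w.
    """
    rng = rng or np.random.default_rng(0)
    w = np.asarray(w0, dtype=float)
    s_list, y_list = [], []
    prev_snap, prev_g = None, None
    for _ in range(epochs):
        w_snap = w.copy()
        g_snap = full_grad(w_snap)
        if prev_snap is not None:
            s, y = w_snap - prev_snap, g_snap - prev_g
            if s @ y > 1e-10:                 # keep only safe curvature pairs
                s_list.append(s)
                y_list.append(y)
                if len(s_list) > memory:
                    s_list.pop(0)
                    y_list.pop(0)
        prev_snap, prev_g = w_snap, g_snap
        for _ in range(m):
            idx = rng.integers(0, n, size=batch)
            v = grads(w, idx) - grads(w_snap, idx) + g_snap   # variance-reduced gradient
            w = w - eta * two_loop(v, s_list, y_list)         # quasi-Newton direction
    return w
```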
Journal: CoRR
Volume: abs/1304.5504
Pages: -
Publication year: 2013